-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[LLD] Add flag to force PLT entries to have a BTI #168365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-lld-elf Author: Gergely Bálint (bgergely0) ChangesAdded the Note that always adding a Full diff: https://github.com/llvm/llvm-project/pull/168365.diff 5 Files Affected:
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 2a97df4785ecb..a9702fde05b9b 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -1186,8 +1186,8 @@ void AArch64BtiPac::writePlt(uint8_t *buf, const Symbol &sym,
// address may escape if referenced by a direct relocation. If relative
// vtables are used then if the vtable is in a shared object the offsets will
// be to the PLT entry. The condition is conservative.
- bool hasBti = btiHeader &&
- (sym.hasFlag(NEEDS_COPY) || sym.isInIplt || sym.thunkAccessed);
+ bool hasBti = btiHeader && (sym.hasFlag(NEEDS_COPY) || sym.isInIplt ||
+ sym.thunkAccessed || ctx.arg.forceBtiPlt);
if (hasBti) {
memcpy(buf, btiData, sizeof(btiData));
buf += sizeof(btiData);
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 8ec5a2c04e71c..b1060bf165ac8 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -330,6 +330,7 @@ struct Config {
bool exportDynamic;
bool fixCortexA53Errata843419;
bool fixCortexA8;
+ bool forceBtiPlt = false;
bool formatBinary = false;
bool fortranCommon;
bool gcSections;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 8647752be31fe..14c6b6f8a1dad 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1486,6 +1486,7 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
ctx.arg.nmagic = args.hasFlag(OPT_nmagic, OPT_no_nmagic, false);
ctx.arg.noinhibitExec = args.hasArg(OPT_noinhibit_exec);
ctx.arg.nostdlib = args.hasArg(OPT_nostdlib);
+ ctx.arg.forceBtiPlt = args.hasArg(OPT_bti_plt);
ctx.arg.oFormatBinary = isOutputFormatBinary(ctx, args);
ctx.arg.omagic = args.hasFlag(OPT_omagic, OPT_no_omagic, false);
ctx.arg.optRemarksFilename = args.getLastArgValue(OPT_opt_remarks_filename);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 75184de496448..2059e0c62d748 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -212,6 +212,8 @@ defm eh_frame_hdr: B<"eh-frame-hdr",
def emit_relocs: F<"emit-relocs">, HelpText<"Generate relocations in output">;
+def bti_plt: FF<"force-bti-plt">, HelpText<"Force all PLT entries to have BTI">;
+
def enable_new_dtags: F<"enable-new-dtags">,
HelpText<"Enable new dynamic tags (default)">;
diff --git a/lld/test/ELF/aarch64-feature-bti.s b/lld/test/ELF/aarch64-feature-bti.s
index 8d7c1f2826c17..191e3710054f8 100644
--- a/lld/test/ELF/aarch64-feature-bti.s
+++ b/lld/test/ELF/aarch64-feature-bti.s
@@ -262,6 +262,13 @@
# REPORT-ERR: error: unknown -z bti-report= value: u{{$}}
# REPORT-EMPTY:
+## Force all PLT entries to start with BTI with the force-bti-plt command line option.
+# RUN: ld.lld %t.o %t2.o -z force-bti --force-bti-plt %t.so -o %tforcebtiplt.exe
+# RUN: llvm-objdump --no-print-imm-hex -d --mattr=+bti --no-show-raw-insn %tforcebtiplt.exe | FileCheck --check-prefix=FORCE-BTI %s
+
+# FORCE-BTI: 00000000002103a0 <func2@plt>:
+# FORCE-BTI-NEXT: 2103a0: bti c
+
.section ".note.gnu.property", "a"
.long 4
.long 0x10
|
|
@llvm/pr-subscribers-lld Author: Gergely Bálint (bgergely0) ChangesAdded the Note that always adding a Full diff: https://github.com/llvm/llvm-project/pull/168365.diff 5 Files Affected:
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 2a97df4785ecb..a9702fde05b9b 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -1186,8 +1186,8 @@ void AArch64BtiPac::writePlt(uint8_t *buf, const Symbol &sym,
// address may escape if referenced by a direct relocation. If relative
// vtables are used then if the vtable is in a shared object the offsets will
// be to the PLT entry. The condition is conservative.
- bool hasBti = btiHeader &&
- (sym.hasFlag(NEEDS_COPY) || sym.isInIplt || sym.thunkAccessed);
+ bool hasBti = btiHeader && (sym.hasFlag(NEEDS_COPY) || sym.isInIplt ||
+ sym.thunkAccessed || ctx.arg.forceBtiPlt);
if (hasBti) {
memcpy(buf, btiData, sizeof(btiData));
buf += sizeof(btiData);
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 8ec5a2c04e71c..b1060bf165ac8 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -330,6 +330,7 @@ struct Config {
bool exportDynamic;
bool fixCortexA53Errata843419;
bool fixCortexA8;
+ bool forceBtiPlt = false;
bool formatBinary = false;
bool fortranCommon;
bool gcSections;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 8647752be31fe..14c6b6f8a1dad 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1486,6 +1486,7 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
ctx.arg.nmagic = args.hasFlag(OPT_nmagic, OPT_no_nmagic, false);
ctx.arg.noinhibitExec = args.hasArg(OPT_noinhibit_exec);
ctx.arg.nostdlib = args.hasArg(OPT_nostdlib);
+ ctx.arg.forceBtiPlt = args.hasArg(OPT_bti_plt);
ctx.arg.oFormatBinary = isOutputFormatBinary(ctx, args);
ctx.arg.omagic = args.hasFlag(OPT_omagic, OPT_no_omagic, false);
ctx.arg.optRemarksFilename = args.getLastArgValue(OPT_opt_remarks_filename);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 75184de496448..2059e0c62d748 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -212,6 +212,8 @@ defm eh_frame_hdr: B<"eh-frame-hdr",
def emit_relocs: F<"emit-relocs">, HelpText<"Generate relocations in output">;
+def bti_plt: FF<"force-bti-plt">, HelpText<"Force all PLT entries to have BTI">;
+
def enable_new_dtags: F<"enable-new-dtags">,
HelpText<"Enable new dynamic tags (default)">;
diff --git a/lld/test/ELF/aarch64-feature-bti.s b/lld/test/ELF/aarch64-feature-bti.s
index 8d7c1f2826c17..191e3710054f8 100644
--- a/lld/test/ELF/aarch64-feature-bti.s
+++ b/lld/test/ELF/aarch64-feature-bti.s
@@ -262,6 +262,13 @@
# REPORT-ERR: error: unknown -z bti-report= value: u{{$}}
# REPORT-EMPTY:
+## Force all PLT entries to start with BTI with the force-bti-plt command line option.
+# RUN: ld.lld %t.o %t2.o -z force-bti --force-bti-plt %t.so -o %tforcebtiplt.exe
+# RUN: llvm-objdump --no-print-imm-hex -d --mattr=+bti --no-show-raw-insn %tforcebtiplt.exe | FileCheck --check-prefix=FORCE-BTI %s
+
+# FORCE-BTI: 00000000002103a0 <func2@plt>:
+# FORCE-BTI-NEXT: 2103a0: bti c
+
.section ".note.gnu.property", "a"
.long 4
.long 0x10
|
|
I'm interested in reviewer's thoughts on the security implications: I'd say that being able to call all PLT entries is not a significant reduction in the security guarantees provided by BTI. (Especially since this is the default in the GNU toolchain.) Thoughts? |
|
My thoughts on security. The PLT sequence is itself an indirect jump, so a target function defined in the same executable/shared-library that the PLT entry is transferring control to must have a BTI compatible instruction. If an attacker has an arbitrary write primitive then they could just target the function in question directly. Adding a BTI is not a significant loss in that case. If the PLT entry is targeting another executable/shared-library then the destination may not have a BTI compatible instruction (BTI property is per shared-object not per program), and with lazy loading the attacker may not know the address of the destination (but they do know the address of the PLT). Not putting a BTI at the start of the PLT entry for "imported" symbols does offer some additional protection. It is difficult to quantify how much effect on security this would have in practice. Given that on most systems using BTI, the whole program (executable and shared-objects) are likely to have BTI enabled, I suspect not much. One thing that could be measured is whether the addition of an additional BTI instruction in all PLT sequences has any measurable affect on performance. My expectation is that this will be lost in the noise of an out-of-order implementation, but on the simplest in-order implementations like the A320 it may be noticeable if loops are involved. Another factor to consider is how much tool implementation detail BOLT is willing to depend on. For example GNU ld happens to put BTI at the start of each PLT entry, but are there any guarantees that will continue? Perhaps this should be written into the ELF conventions https://github.com/ARM-software/abi-aa/tree/main/baabielf64 if BOLT is considering this ABI. |
|
Thanks for the feedback Peter!
Yes, this is in line with what I imagined.
thanks, good point to consider!
We can easily check if the PLTs are generated according to expectations in BOLT, so it wouldn't create binaries with incorrect BTI property. The reason I still opened this PR is that while technically speaking this should be solvable in BOLT, it would definately be a much more involved change than this (+7 line code change, and +6 line test). |
|
Personally I would prefer not to have a command-line option as I think it would be easily missed by users. For example if I'm using BOLT with -mbranch-protection then I must remember to add a new flag. I'd prefer that we either choose to move LLD to always put BTI, or BOLT assumes that it has to deal with a PLT entry without one. Would like to get some feedback from other reviewers on that. Looking at GNU ld in more detail it looks like it can generate PLT[N] entries without a BTI in some circumstances. This might tip the scales towards BOLT having to handle PLT entries without one, or else we'll need a corresponding binutils change. Checking back through the history. When I first wrote the patch in https://reviews.llvm.org/D62609 I had a BTI on each entry with a FIXME in the code to optimise later. I found this line in the review: I can confirm this with something like: For a trivial shared library with two functions f and g |
|
Thanks. I missed this part. Rechecking now: it's also not true that for BTI executables, PLTs always have BTIs. In smaller examples I don't even see nop placeholders. I assume unless at least one PLT is called indirectly, they are all without BTI to save space. (This is still BFD.) Conclusion: however tempting this small change was, it is insufficient to solve the problem it aimed to solve. Thanks for taking a look, and sharing your insights @smithp35! |

Added the
--force-bti-pltflag to make sure the PLTs emitted by AArch64BtiPac have aBTI clanding pad regardless of indirect calls inside the linked binary.Post-link optimizations may require adding new thunks. Having this flag would simplify these optimizations by requiring less patching by tools.
Note that always adding a
BTI cto PLT stubs is the default behaviour for GNU BFD.